Introduction
The dominatR package provides a flexible suite of normalization methods for
transcriptomics data, including CPM, TPM, RPKM, min-max scaling, and quantile
normalization. It is compatible with common data structures such as matrix,
data.frame, and SummarizedExperiment, and is designed to streamline
preprocessing in bioinformatics workflows.
You can install the development version of ‘dominatR’ from GitHub:
# From GitHub (development version)
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("EthanCHEN6/dominatR_testing", force = TRUE)
The examples below use the airway dataset, which contains RNA-Seq counts for airway smooth muscle cells and is commonly used for gene expression analysis.
# Load required packages
library(dominatR)
library(SummarizedExperiment)
# Load airway dataset
library(airway)
data(airway)
airway_se <- airway
The airway dataset is stored as a RangedSummarizedExperiment object, a subclass of SummarizedExperiment. Inspect the dataset:
airway_se
#> class: RangedSummarizedExperiment
#> dim: 63677 8
#> metadata(1): ''
#> assays(1): counts
#> rownames(63677): ENSG00000000003 ENSG00000000005 ... ENSG00000273492
#> ENSG00000273493
#> rowData names(10): gene_id gene_name ... seq_coord_system symbol
#> colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
#> colData names(9): SampleName cell ... Sample BioSample
assayNames(airway_se)
#> [1] "counts"
dim(assay(airway_se))
#> [1] 63677 8
head(assay(airway_se))
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 679 448 873 408 1138
#> ENSG00000000005 0 0 0 0 0
#> ENSG00000000419 467 515 621 365 587
#> ENSG00000000457 260 211 263 164 245
#> ENSG00000000460 60 55 40 35 78
#> ENSG00000000938 0 0 2 0 1
#> SRR1039517 SRR1039520 SRR1039521
#> ENSG00000000003 1047 770 572
#> ENSG00000000005 0 0 0
#> ENSG00000000419 799 417 508
#> ENSG00000000457 331 233 229
#> ENSG00000000460 63 76 60
#> ENSG00000000938 0 0 0
It contains raw read counts for 63,677 genes across 8 samples.
Normalization is critical for correcting technical biases and enabling meaningful biological comparisons.
The package contains different normalization methods:
· cpm_normalization
· minmax_normalization
· quantile_normalization
· rpkm_normalization
· tpm_normalization
Let’s explore the usage of each normalization method on the count dataset described above.
Min-Max normalization is a linear transformation that rescales the values in each column (sample) to a specified range (typically [0, 1]). It is useful when you want to bring samples measured on different scales onto the same scale.
Function Purpose:
· Rescales each column to fit within a range [new_min, new_max].
· Preserves the relative structure of values within each column.
· Useful when different assays or samples have varying scales.
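The rescaling described above can be sketched in base R. This is a minimal illustration that assumes the standard column-wise min-max formula; `minmax_sketch` is a hypothetical helper written for this example, not the package's implementation, and its handling of constant columns is an assumption.

```r
# Column-wise min-max rescaling (hypothetical helper, for illustration only).
minmax_sketch <- function(x, new_min = 0, new_max = 1) {
  x <- as.matrix(x)
  apply(x, 2, function(col) {
    rng <- range(col, na.rm = TRUE)
    if (rng[1] == rng[2]) {
      # Constant column: the formula would divide by zero, so return new_min
      return(rep(new_min, length(col)))
    }
    (col - rng[1]) / (rng[2] - rng[1]) * (new_max - new_min) + new_min
  })
}

# Toy matrix: 3 genes (rows) x 2 samples (columns)
toy <- matrix(c(0, 5, 10, 2, 4, 6), nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
scaled <- minmax_sketch(toy)                               # each column mapped to [0, 1]
scaled_custom <- minmax_sketch(toy, new_min = 10, new_max = 20)
scaled
```

Each column is scaled independently, so the same raw value can map to different scaled values in different samples.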
# Prepare input matrix
count_mat <- assay(airway)
# Apply min-max normalization
airway_minmax <- minmax_normalization(count_mat, new_min = 0, new_max = 1)
# Inspect structure
dim(airway_minmax)
#> [1] 63677 8
summary(as.vector(airway_minmax))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0000000 0.0000000 0.0000000 0.0009679 0.0000274 1.0000000
head(airway_minmax[, 1:5])
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513
#> ENSG00000000003 0.0022792424 0.0017523136 1.699217e-03 0.0014897144
#> ENSG00000000005 0.0000000000 0.0000000000 0.000000e+00 0.0000000000
#> ENSG00000000419 0.0015676086 0.0020143784 1.208721e-03 0.0013327102
#> ENSG00000000457 0.0008727585 0.0008253084 5.119062e-04 0.0005988068
#> ENSG00000000460 0.0002014058 0.0002151278 7.785646e-05 0.0001277941
#> ENSG00000000938 0.0000000000 0.0000000000 3.892823e-06 0.0000000000
#> SRR1039516
#> ENSG00000000003 2.860799e-03
#> ENSG00000000005 0.000000e+00
#> ENSG00000000419 1.475649e-03
#> ENSG00000000457 6.159013e-04
#> ENSG00000000460 1.960829e-04
#> ENSG00000000938 2.513883e-06
Why Use Custom Ranges? You can set new_min = 10 and new_max = 20 if your downstream application expects values on a different scale:
df_scaled <- minmax_normalization(count_mat, new_min = 10, new_max = 20)
head(df_scaled) # All columns now range from 10 to 20
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 10.02279 10.01752 10.01699 10.01490 10.02861
#> ENSG00000000005 10.00000 10.00000 10.00000 10.00000 10.00000
#> ENSG00000000419 10.01568 10.02014 10.01209 10.01333 10.01476
#> ENSG00000000457 10.00873 10.00825 10.00512 10.00599 10.00616
#> ENSG00000000460 10.00201 10.00215 10.00078 10.00128 10.00196
#> ENSG00000000938 10.00000 10.00000 10.00004 10.00000 10.00003
#> SRR1039517 SRR1039520 SRR1039521
#> ENSG00000000003 10.02607 10.02033 10.01536
#> ENSG00000000005 10.00000 10.00000 10.00000
#> ENSG00000000419 10.01990 10.01101 10.01364
#> ENSG00000000457 10.00824 10.00615 10.00615
#> ENSG00000000460 10.00157 10.00201 10.00161
#> ENSG00000000938 10.00000 10.00000 10.00000
# SummarizedExperiment input
se <- airway
# Option A: Overwrite the default assay
se1 <- minmax_normalization(se)
head(assay(se1))
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513
#> ENSG00000000003 0.0022792424 0.0017523136 1.699217e-03 0.0014897144
#> ENSG00000000005 0.0000000000 0.0000000000 0.000000e+00 0.0000000000
#> ENSG00000000419 0.0015676086 0.0020143784 1.208721e-03 0.0013327102
#> ENSG00000000457 0.0008727585 0.0008253084 5.119062e-04 0.0005988068
#> ENSG00000000460 0.0002014058 0.0002151278 7.785646e-05 0.0001277941
#> ENSG00000000938 0.0000000000 0.0000000000 3.892823e-06 0.0000000000
#> SRR1039516 SRR1039517 SRR1039520 SRR1039521
#> ENSG00000000003 2.860799e-03 0.0026074678 0.0020325525 0.0015356158
#> ENSG00000000005 0.000000e+00 0.0000000000 0.0000000000 0.0000000000
#> ENSG00000000419 1.475649e-03 0.0019898441 0.0011007460 0.0013637987
#> ENSG00000000457 6.159013e-04 0.0008243284 0.0006150451 0.0006147833
#> ENSG00000000460 1.960829e-04 0.0001568963 0.0002006156 0.0001610786
#> ENSG00000000938 2.513883e-06 0.0000000000 0.0000000000 0.0000000000
# Option B: Write to a new assay slot
se2 <- minmax_normalization(se, new_assay_name = "minmax_counts")
Example Output: As a hypothetical illustration, suppose a column contained values between 233 and 12890. After min-max normalization with new_min = 0, new_max = 1:
233 → mapped to 0.0
12890 → mapped to 1.0
In the actual data, for gene ENSG00000000003 in SRR1039508:
- The original count is 679.
- After min-max normalization, the value is rescaled to 0.0023 on a scale of
[0, 1]. Because scaling is applied per column, this means 679 is small
relative to the largest count in this sample. It does not mean the gene is
among the least expressed: the median scaled value is 0, so at least half of
all values are lower still.
Quantile normalization makes the distribution of values identical across all samples. Within each sample the values are ranked, and each value is replaced by the mean, across samples, of the values that share its rank.
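To make the idea concrete, here is a minimal base R sketch of the textbook quantile normalization procedure. `quantile_sketch` is a hypothetical helper; it breaks ties by order of appearance and assumes no missing values, which may differ from how quantile_normalization() behaves.

```r
# Textbook quantile normalization (hypothetical helper, for illustration only).
quantile_sketch <- function(x) {
  x <- as.matrix(x)
  # Sort each column, then average across columns at each rank
  sorted <- apply(x, 2, sort)
  rank_means <- rowMeans(sorted)
  # Replace every value by the mean that corresponds to its rank in its column
  out <- apply(x, 2, function(col) rank_means[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

# Toy matrix: 3 genes (rows) x 2 samples (columns)
toy <- matrix(c(5, 2, 3, 4, 1, 6), nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
qn <- quantile_sketch(toy)
qn
```

After the transformation both columns contain the same set of values; only their assignment to genes differs.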
## Apply quantile normalization
airway_quantile <- quantile_normalization(airway_se)
## Check result
dim(assay(airway_quantile))
#> [1] 63677 8
summary(as.vector(assay(airway_quantile)))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0 0.0 0.0 344.4 9.6 361483.1
head(assay(airway_quantile)[1:5, 1:5])
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 690.875 504.750 773.875 613.75 1010.000
#> ENSG00000000005 0.000 0.000 0.000 0.00 0.000
#> ENSG00000000419 468.875 582.375 550.625 552.00 516.875
#> ENSG00000000457 257.375 241.375 225.250 254.00 213.125
#> ENSG00000000460 58.000 65.250 31.500 53.75 67.750
The normalized values remain on the count scale; no CPM conversion or log2 transformation is applied in this step.
Example Output:
For ENSG00000000003 in SRR1039508, the raw count of 679 becomes 690.875.
After normalization every sample shares the same distribution of values, so a
gene's value reflects its rank within the sample and can be compared directly
across samples.
This section demonstrates how to apply Counts Per Million (CPM) normalization using the cpm_normalization() function in the dominatR package. The function supports matrix, data.frame, and SummarizedExperiment inputs.
Function Purpose:
The cpm_normalization() function rescales raw count data such that each column sums to one million, optionally followed by a log2 transformation. This makes count data comparable across samples of different sequencing depths.
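As a rough sketch of the arithmetic, CPM can be written in a few lines of base R. `cpm_sketch` is a hypothetical helper, and the pseudocount of 1 in the log transform is an assumption: it is inferred from the values printed in this section (32.900521 before and 5.083236 after the transform), not from the package source.

```r
# CPM sketch (hypothetical helper, for illustration only).
cpm_sketch <- function(x, log_trans = FALSE) {
  x <- as.matrix(x)
  # Divide each column by its total and scale to one million
  cpm <- sweep(x, 2, colSums(x), "/") * 1e6
  # Assumption: the log transform is log2(CPM + 1)
  if (log_trans) log2(cpm + 1) else cpm
}

# Toy matrix: 3 genes (rows) x 2 samples with different sequencing depths
toy <- matrix(c(10, 30, 60, 0, 50, 150), nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
cpm_toy <- cpm_sketch(toy)
colSums(cpm_toy)            # each column sums to one million

# Check the assumed pseudocount against the printed value for ENSG00000000003
logged <- log2(32.900521 + 1)
logged
```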
df <- assay(airway)
# Normalize without log2-transform
df_cpm <- cpm_normalization(df, log_trans = FALSE)
head(df_cpm[, 1:5])
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 32.900521 23.817776 34.43970525 26.906868 46.54699807
#> ENSG00000000005 0.000000 0.000000 0.00000000 0.000000 0.00000000
#> ENSG00000000419 22.628193 27.379809 24.49834703 24.071095 24.00974329
#> ENSG00000000457 12.598138 11.217747 10.37530639 10.815506 10.02110240
#> ENSG00000000460 2.907263 2.924057 1.57799337 2.308187 3.19039178
#> ENSG00000000938 0.000000 0.000000 0.07889967 0.000000 0.04090246
# Normalize with log2-transform
df_cpm_log <- cpm_normalization(df, log_trans = TRUE)
head(df_cpm_log[, 1:5])
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 5.083236 4.633302 5.1472947 4.802548 5.57128235
#> ENSG00000000005 0.000000 0.000000 0.0000000 0.000000 0.00000000
#> ENSG00000000419 4.562437 4.826793 4.6723318 4.647953 4.64441834
#> ENSG00000000457 3.765337 3.610906 3.5078335 3.562609 3.46219663
#> ENSG00000000460 1.966158 1.972346 1.3662486 1.726041 2.06708514
#> ENSG00000000938 0.000000 0.000000 0.1095607 0.000000 0.05783488
library(SummarizedExperiment)
# Apply in-place normalization (overwrite assay)
se1 <- cpm_normalization(airway, log_trans = FALSE)
head(assay(se1))
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 32.900521 23.817776 34.43970525 26.906868 46.54699807
#> ENSG00000000005 0.000000 0.000000 0.00000000 0.000000 0.00000000
#> ENSG00000000419 22.628193 27.379809 24.49834703 24.071095 24.00974329
#> ENSG00000000457 12.598138 11.217747 10.37530639 10.815506 10.02110240
#> ENSG00000000460 2.907263 2.924057 1.57799337 2.308187 3.19039178
#> ENSG00000000938 0.000000 0.000000 0.07889967 0.000000 0.04090246
#> SRR1039517 SRR1039520 SRR1039521
#> ENSG00000000003 33.973415 40.259015 27.026857
#> ENSG00000000005 0.000000 0.000000 0.000000
#> ENSG00000000419 25.926226 21.802609 24.002873
#> ENSG00000000457 10.740401 12.182273 10.820193
#> ENSG00000000460 2.044246 3.973617 2.834985
#> ENSG00000000938 0.000000 0.000000 0.000000
# Save to a new assay slot
se2 <- cpm_normalization(airway, log_trans = TRUE,
                         new_assay_name = "cpm_logged")
head(assay(se2, "cpm_logged"))
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 5.083236 4.633302 5.1472947 4.802548 5.57128235
#> ENSG00000000005 0.000000 0.000000 0.0000000 0.000000 0.00000000
#> ENSG00000000419 4.562437 4.826793 4.6723318 4.647953 4.64441834
#> ENSG00000000457 3.765337 3.610906 3.5078335 3.562609 3.46219663
#> ENSG00000000460 1.966158 1.972346 1.3662486 1.726041 2.06708514
#> ENSG00000000938 0.000000 0.000000 0.1095607 0.000000 0.05783488
#> SRR1039517 SRR1039520 SRR1039521
#> ENSG00000000003 5.128187 5.366637 4.808738
#> ENSG00000000005 0.000000 0.000000 0.000000
#> ENSG00000000419 4.750940 4.511127 4.644022
#> ENSG00000000457 3.553410 3.720527 3.563182
#> ENSG00000000460 1.606085 2.314295 1.939221
#> ENSG00000000938 0.000000 0.000000 0.000000
# Simulate a second raw-count assay, then normalize it by name
new_counts <- matrix(sample(1:100000, nrow(airway) * ncol(airway), TRUE),
                     nrow = nrow(airway))
rownames(new_counts) <- rownames(airway)
colnames(new_counts) <- colnames(airway)
assay(airway, "new_raw") <- new_counts
se3 <- cpm_normalization(airway, assay_name = "new_raw",
                         new_assay_name = "cpm_new_raw")
head(assay(se3, "cpm_new_raw"))
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 14.788145 3.498854 2.971905 6.333742 14.536293
#> ENSG00000000005 20.119063 27.425499 14.233632 28.947764 15.390852
#> ENSG00000000419 5.239054 9.119207 28.941853 9.729289 18.336495
#> ENSG00000000457 18.404801 17.627199 10.962249 15.116710 17.428310
#> ENSG00000000460 25.706696 5.218742 4.729612 13.200591 0.275027
#> ENSG00000000938 22.719872 21.485240 23.634589 5.353300 1.202028
#> SRR1039517 SRR1039520 SRR1039521
#> ENSG00000000003 15.461546 25.35930 10.972204
#> ENSG00000000005 31.251801 18.95559 9.578674
#> ENSG00000000419 18.533049 21.64368 13.116313
#> ENSG00000000457 24.701435 18.33327 18.638030
#> ENSG00000000460 26.392610 11.94689 26.177399
#> ENSG00000000938 4.932203 18.24914 7.054882
The output of cpm_normalization() depends on the input type:
· If you input a matrix or data.frame, it returns a numeric matrix where:
  · Each column sums to 1,000,000 (unless you apply the log transform).
  · Row and column names are preserved.
· If you input a SummarizedExperiment, it returns the same SE object with:
  · Either the original assay overwritten, or a new assay added (if new_assay_name is specified).
Example Output: Gene ENSG00000000003 has a raw count of 679 in SRR1039508. After CPM normalization the value is 32.90, meaning that about 33 of every million reads counted in this sample are assigned to the gene. With log_trans = TRUE the value becomes 5.083236, which matches log2(32.90 + 1). Because CPM adjusts for sequencing depth, these values are comparable across samples with different sequencing depths.
Reads per kilobase per million (RPKM) normalization adjusts for both gene length and sequencing depth, making it particularly useful for RNA-Seq data. RPKM helps compare gene expression levels across genes of different lengths.
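The conventional RPKM formula divides each count by the gene length in kilobases and by the library size in millions of reads. The sketch below illustrates that formula in base R; `rpkm_sketch` and its `gene_length_bp` argument are hypothetical names chosen for this example and do not describe the signature of rpkm_normalization().

```r
# RPKM sketch (hypothetical helper, for illustration only).
rpkm_sketch <- function(counts, gene_length_bp) {
  counts <- as.matrix(counts)
  stopifnot(length(gene_length_bp) == nrow(counts))
  per_million <- colSums(counts) / 1e6     # library size in millions of reads
  length_kb <- gene_length_bp / 1e3        # gene length in kilobases
  # reads / (millions of reads) / kilobases; the vector recycles over rows
  sweep(counts, 2, per_million, "/") / length_kb
}

# Toy data: 2 genes x 2 samples, each sample with one million reads
toy_counts <- matrix(c(200000, 800000, 500000, 500000), nrow = 2,
                     dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
toy_length <- c(2000, 4000)                # lengths in base pairs
rpkm_toy <- rpkm_sketch(toy_counts, toy_length)
rpkm_toy
```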
## Calculate gene length
rowData(airway_se)$gene_length <- rowData(airway_se)$gene_seq_end -
rowData(airway_se)$gene_seq_start
## Apply RPKM normalization
airway_se_rpkm <- rpkm_normalization(airway_se, gene_length, log_trans = TRUE)
## Check the result
dim(assay(airway_se_rpkm)) # Check the dimensions
#> [1] 63677 8
summary(as.vector(assay(airway_se_rpkm))) # Summary statistics for all values
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0000 0.0000 0.0000 0.3134 0.1280 13.3663
head(assay(airway_se_rpkm)[1:5, 1:5])
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 1.96574725 1.63406253 2.01510789 1.75562333 2.35376434
#> ENSG00000000005 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
#> ENSG00000000419 0.96736029 1.10825777 1.02446804 1.01161910 1.00976461
#> ENSG00000000457 0.35866816 0.32344632 0.30152042 0.31301889 0.29220123
#> ENSG00000000460 0.02168423 0.02180855 0.01181011 0.01724252 0.02377868
Example Output: Gene ENSG00000000003 has a raw count of 679 in SRR1039508. After RPKM normalization with log_trans = TRUE, the value is 1.96574725. This value is on the log2 scale and adjusts for both gene length and sequencing depth, which makes it comparable across genes of different lengths.
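Transcripts per million (TPM) normalization also adjusts for gene length and sequencing depth. Counts are first divided by gene length, and the resulting rates are then scaled so that each sample sums to one million.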
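In the conventional definition, TPM first converts counts to reads per kilobase and then rescales each sample so that these rates sum to one million. The base R sketch below illustrates that definition; `tpm_sketch` and `gene_length_bp` are hypothetical names for this example, not the interface of tpm_normalization().

```r
# TPM sketch (hypothetical helper, for illustration only).
tpm_sketch <- function(counts, gene_length_bp) {
  counts <- as.matrix(counts)
  stopifnot(length(gene_length_bp) == nrow(counts))
  rate <- counts / (gene_length_bp / 1e3)  # reads per kilobase, row-wise
  # Rescale each sample so that its rates sum to one million
  sweep(rate, 2, colSums(rate), "/") * 1e6
}

# Toy data: 2 genes x 2 samples
toy_counts <- matrix(c(200000, 800000, 500000, 500000), nrow = 2,
                     dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
toy_length <- c(2000, 4000)                # lengths in base pairs
tpm_toy <- tpm_sketch(toy_counts, toy_length)
colSums(tpm_toy)                           # each column sums to one million
```

Unlike RPKM, the per-sample total is fixed, so a TPM value can be read as a share of the sample.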
## Calculate gene_length if not provided
rowData(airway_se)$gene_length <- rowData(airway_se)$gene_seq_end -
rowData(airway_se)$gene_seq_start
## Apply TPM normalization
airway_tpm <- tpm_normalization(airway_se, log_trans = TRUE)
## Check result
dim(assay(airway_tpm))
#> [1] 63677 8
summary(as.vector(assay(airway_tpm)))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0000 0.0000 0.0000 0.7951 0.7534 16.1408
head(assay(airway_tpm)[1:5, 1:5])
#> SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516
#> ENSG00000000003 4.4294986 3.9577411 4.55434051 4.2249660 4.8647104
#> ENSG00000000005 0.0000000 0.0000000 0.00000000 0.0000000 0.0000000
#> ENSG00000000419 2.9549906 3.1678707 3.11233806 3.0989031 2.9884095
#> ENSG00000000457 1.5828547 1.4524115 1.44301228 1.4877425 1.3427320
#> ENSG00000000460 0.1467549 0.1443756 0.08513068 0.1237197 0.1553897
Example Output:
For example, the value for ENSG00000000003 in SRR1039508 is 4.43. Because
log_trans = TRUE was used, this is a log2-transformed TPM value rather than
4.43 transcripts per million. It accounts for both gene length and
sequencing depth.
Visual representation is essential for interpreting the structure, dominance, and variability of biological features across samples or conditions.
Our package offers a collection of entropy-based visualization functions designed for different analytical perspectives:
plot_circle()
Displays each sample’s entropy and average magnitude in a polar
coordinate layout.
plot_circle_frequency()
Summarizes the density of entropy-magnitude bins using circular heat
segments.
plot_abacus()
Groups features into quantile-based dominance categories and displays
them in an abacus-style panel plot.
plot_rope()
Compares two numeric vectors using a central “rope” layout to visualize
dominance asymmetry and entropy filtering.
plot_triangle()
Visualizes three variables in a ternary layout, highlighting balance or
dominance among triplets.
Let’s now explore each visualization function with real data examples.
This function visualizes high-dimensional input (e.g., a gene expression matrix) using a polar coordinate system where:
· Radial position represents Shannon entropy (distribution uniformity)
· Angular position represents the dominant variable (feature with maximum value)
· Point color represents either the dominant variable or an optional factor
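The two quantities behind this layout, Shannon entropy and the dominant variable, can be computed by hand. The sketch below is a minimal base R illustration; `row_entropy` is a hypothetical helper, and the treatment of all-zero rows and of ties is an assumption rather than the behavior of plot_circle().

```r
# Row-wise Shannon entropy in bits (hypothetical helper, for illustration only).
row_entropy <- function(x) {
  x <- as.matrix(x)
  apply(x, 1, function(v) {
    total <- sum(v)
    if (total == 0) return(0)      # assumption: all-zero rows get entropy 0
    p <- v / total                 # convert the row to proportions
    p <- p[p > 0]                  # 0 * log2(0) is treated as 0
    -sum(p * log2(p))
  })
}

toy <- rbind(specific = c(0, 0, 8, 0),   # one column holds everything
             paired   = c(4, 4, 0, 0),   # two columns share equally
             uniform  = c(2, 2, 2, 2))   # all columns are equal
colnames(toy) <- paste0("Column_", 1:4)

entropy <- row_entropy(toy)
# Dominant column = column with the maximum value (ties resolved to the first)
dominant <- colnames(toy)[max.col(toy, ties.method = "first")]
data.frame(entropy = entropy, dominant = dominant)
```

With n columns the entropy ranges from 0 (one column holds everything) to log2(n) (all columns equal), which is why entropyrange = c(0, 3) covers the full range for eight samples.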
This function is ideal for:
· Visualizing multidimensional datasets (samples × features) in an interpretable 2D circular space.
· Detecting samples/features with high entropy (evenly spread values) or high average expression.
· Identifying mixed-behavior regions such as dense clusters or entropy-magnitude outliers.
· Facilitating compact visualization across thousands of rows or columns.
Visualize genes from the airway dataset with specific filtering criteria.
Each point represents a gene, colored by the sample in which it has the highest
expression (dominant sample). We’ll highlight a specific gene of interest.
rowData(se)$gene_length <- rowData(se)$gene_seq_end - rowData(se)$gene_seq_start
se <- tpm_normalization(se, log_trans = TRUE, new_assay_name = "tpm_norm")
se <- se[1:1000, ]
#' Rename columns for consistency
colnames(se) <- paste('Column_', 1:8, sep ='')
plot_circle(
x = se,
n = 8,
entropyrange = c(0, 3),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
assay_name = 'tpm_norm'
)
plot_circle(
x = se,
n = 8,
entropyrange = c(0, 1.5),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
assay_name = 'tpm_norm'
)
plot_circle(
x = se,
n = 8,
entropyrange = c(2, 3),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
assay_name = 'tpm_norm'
)
plot_circle(
x = se,
n = 8,
column_variable_factor = 'gene_biotype',
entropyrange = c(2,3),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
assay_name = 'tpm_norm'
)
plot_circle(
x = se,
n = 8,
column_variable_factor = 'gene_biotype',
point_size = 3,
entropyrange = c(0,1.5),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
assay_name = 'tpm_norm'
)
# Emphasize protein-coding genes in orange
plot_circle(
x = se,
n = 8,
column_variable_factor = 'gene_biotype',
point_size = 3,
entropyrange = c(0,1.5),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
assay_name = 'tpm_norm',
point_fill_colors = c('protein_coding' = 'orange'),
point_line_colors = c('protein_coding' = 'orange')
)
se_result <- plot_circle(
x = se,
n = 8,
column_variable_factor = 'gene_biotype',
point_size = 3,
entropyrange = c(0,1.5),
magnituderange = c(0, Inf),
label = 'legend',
output_table = TRUE,
assay_name = 'tpm_norm',
point_fill_colors = c('protein_coding' = 'orange'),
point_line_colors = c('protein_coding' = 'orange')
)
The result is a list of two objects:
· se_result[[1]]: a ggplot2 object for visualization
· se_result[[2]]: a data.frame with entropy, magnitude, etc.
se_result[[1]]
head(se_result[[2]])
#> Factor Entropy col rad deg
#> ENSG00000000005 protein_coding 0.0000000 Column_8 100.00000 -4.7123890
#> ENSG00000000938 protein_coding 0.9103799 Column_3 87.43514 -0.7853982
#> ENSG00000002726 protein_coding 0.0000000 Column_3 100.00000 -0.7853982
#> ENSG00000004809 protein_coding 0.0000000 Column_1 100.00000 0.7853982
#> ENSG00000004848 protein_coding 0.0000000 Column_7 100.00000 -3.9269908
#> ENSG00000004939 protein_coding 0.0000000 Column_7 100.00000 -3.9269908
#> x y labels rand_deg alpha
#> ENSG00000000005 -1.836970e-14 100.00000 Column_8 -4.7463259 1.0000000
#> ENSG00000000938 6.182598e+01 -61.82598 Column_3 -0.8193351 0.8862025
#> ENSG00000002726 7.071068e+01 -70.71068 Column_3 -0.8096388 1.0000000
#> ENSG00000004809 7.071068e+01 70.71068 Column_1 0.7611575 1.0000000
#> ENSG00000004848 -7.071068e+01 70.71068 Column_7 -3.8833576 1.0000000
#> ENSG00000004939 -7.071068e+01 70.71068 Column_7 -3.8833576 1.0000000
You can also use a raw matrix or data frame, such as one extracted from the assay slot of a SummarizedExperiment object:
#' First we extract the normalized data as a data.frame:
df <- assay(se, 'tpm_norm') |> as.data.frame()
colnames(df) <- paste('Column_', 1:8, sep ='')
plot_circle(
x = df,
n = 8,
entropyrange = c(0, 3),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE
)
# Genes with entropy between 2-3 (more balanced expression)
plot_circle(
x = df,
n = 8,
entropyrange = c(2, 3),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE
)
#' Genes with entropy between 0-2 (more specialized expression)
plot_circle(
x = df,
n = 8,
entropyrange = c(0, 2),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE
)
plot_circle(
x = df,
n = 8,
entropyrange = c(0, 2),
magnituderange = c(0, Inf),
label = 'curve',
output_table = FALSE
)
#' Emphasize expression dominance in Columns 1, 3, and 5
plot_circle(
x = df,
n = 8,
entropyrange = c(0, 2),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
background_alpha_polygon = 0.2,
background_na_polygon = 'transparent',
background_polygon = c('Column_1' = 'indianred',
'Column_3' = 'lightblue',
'Column_5' = 'lightgreen'),
point_fill_colors = c('Column_1' = 'darkred',
'Column_3' = 'darkblue',
'Column_5' = 'darkgreen'),
point_line_colors = c('Column_1' = 'black',
'Column_3' = 'black',
'Column_5' = 'black')
)
# 1.2 Using factor variables
#' Add a factor column for grouping
set.seed(123) # For reproducibility
df$factor <- sample(c('A', 'B', 'C', 'D'), size = nrow(df), replace = TRUE)
plot_circle(
x = df,
n = 8,
column_variable_factor = 'factor',
entropyrange = c(0, 2),
magnituderange = c(0, Inf),
label = 'legend',
output_table = FALSE,
background_alpha_polygon = 0.2,
background_na_polygon = 'transparent',
background_polygon = c('Column_1' = 'indianred',
'Column_3' = 'lightblue',
'Column_5' = 'lightgreen')
)
plot_circle(
x = df,
n = 8,
column_variable_factor = 'factor',
entropyrange = c(0, 2),
magnituderange = c(0, Inf),
label = 'curve',
output_table = FALSE,
background_alpha_polygon = 0.02,
background_na_polygon = 'transparent',
point_fill_colors = c('A' = 'black',
'B' = 'gray',
'C' = 'white',
'D' = 'orange'),
point_line_colors = c('A' = 'black',
'B' = 'gray',
'C' = 'white',
'D' = 'orange')
)
#' When `output_table = TRUE`, returns a list containing:
#' 1. ggplot object
#' 2. Data frame with entropy, magnitude, and dominance information
plot_result <- plot_circle(
x = df,
n = 8,
point_size = 2,
column_variable_factor = 'factor',
entropyrange = c(0, 2),
magnituderange = c(0, Inf),
label = 'curve',
output_table = TRUE,
background_alpha_polygon = 0.02,
background_na_polygon = 'transparent',
point_fill_colors = c('A' = 'black',
'B' = 'gray',
'C' = 'white',
'D' = 'orange'),
point_line_colors = c('A' = 'black',
'B' = 'gray',
'C' = 'white',
'D' = 'orange')
)
# View plot
plot_result[[1]]
# View data
head(plot_result[[2]])
#> Factor Entropy col rad deg x
#> ENSG00000000005 C 0.0000000 Column_4 100.00000 -1.5707963 6.123234e-15
#> ENSG00000000938 B 0.9103799 Column_3 87.43514 -0.7853982 6.182598e+01
#> ENSG00000002587 A 1.5059369 Column_1 73.71299 0.7853982 5.212296e+01
#> ENSG00000002726 C 0.0000000 Column_3 100.00000 -0.7853982 7.071068e+01
#> ENSG00000003147 A 1.9042011 Column_7 60.81406 -3.9269908 -4.300204e+01
#> ENSG00000004809 D 0.0000000 Column_3 100.00000 -0.7853982 7.071068e+01
#> y labels rand_deg alpha
#> ENSG00000000005 -100.00000 Column_4 -1.5465556 1.0000000
#> ENSG00000000938 -61.82598 Column_3 -0.8096388 0.8862025
#> ENSG00000002587 52.12296 Column_1 0.7514612 0.8117579
#> ENSG00000002726 -70.71068 Column_3 -0.7417649 1.0000000
#> ENSG00000003147 43.00204 Column_7 -3.9124464 0.7619749
#> ENSG00000004809 -70.71068 Column_3 -0.8290314 1.0000000
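The returned data frame (plot_result[[2]]) contains the following columns:
· Factor: the grouping value taken from column_variable_factor.
· Entropy: the Shannon entropy of the row.
· col: the dominant column for the row.
· rad, deg: the polar coordinates of the point.
· x, y: Cartesian coordinates derived from (rad, deg), used internally by geom_point().
· labels: the label text, used when label = "legend" or variables_highlight is set.
· rand_deg: the angle with random jitter applied (call set.seed() beforehand for reproducible plots).
· alpha: point transparency (1 for highlighted points, otherwise equal to your background_alpha_polygon setting).
To interpret the plot:
· Dominant Sample: shows which sample has the highest expression for each gene. Useful for identifying sample-specific expression patterns.
· Radial Position: genes near the edge are highly specific to one sample (low entropy); genes near the center have similar expression across samples (high entropy).
· Sector Position: each wedge represents a sample; genes in a sample’s wedge have their highest expression in that sample.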
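The link between entropy and position can be checked against the table printed above. The mapping below is a reconstruction inferred from those printed values for n = 8; it is not taken from the package source, and `polar_sketch` is a hypothetical helper. In particular, the angle formula has only been compared with the eight-column output shown here.

```r
# Reconstructed polar mapping (assumption: inferred from printed output, n = 8).
polar_sketch <- function(entropy, dominant_index, n) {
  # Entropy 0 maps to radius 100 (edge); entropy log2(n) maps to 0 (center)
  rad <- 100 * (1 - (2^entropy - 1) / (n - 1))
  # Angle in radians; Column_1 sits at pi/4 and later columns go clockwise
  deg <- (2 - dominant_index) * (2 * pi / n)
  data.frame(rad = rad, deg = deg, x = rad * cos(deg), y = rad * sin(deg))
}

# ENSG00000000938: Entropy 0.9103799, dominant column Column_3
pt <- polar_sketch(entropy = 0.9103799, dominant_index = 3, n = 8)
pt

# A gene expressed in a single sample sits on the edge of the circle
edge <- polar_sketch(entropy = 0, dominant_index = 1, n = 8)
edge
```

For the first call, the result agrees with the printed row for ENSG00000000938 (rad 87.43514, deg -0.7853982, x 61.82598, y -61.82598) to the displayed precision.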
This function builds upon plot_circle() by stratifying samples into frequency bins and visualizing entropy-magnitude patterns for each bin separately. Useful when your dataset contains variables/features with different levels of occurrence or sparsity (e.g., expressed vs. non-expressed genes).
This function is ideal for:
· Identifying highly prevalent genes/features across a cohort.
· Screening for outlier or inactive variables.
· Visually comparing distributions in a compact format.
# Data preprocessing
rowData(se)$gene_length <- rowData(se)$gene_seq_end - rowData(se)$gene_seq_start
se <- tpm_normalization(se, log_trans = TRUE, new_assay_name = 'tpm_norm')
se <- se[1:1000, ]
# Creating the circle plot data
# First we create the circle plot with output_table = TRUE to get
# the data needed for the frequency plot. We'll use gene biotype as our
# factor variable.
circle_data <- plot_circle(
x = se,
n = 8,
column_variable_factor = 'gene_biotype',
entropyrange = c(0, Inf),
magnituderange = c(0, Inf),
label = 'legend',
output_table = TRUE,
assay_name = 'tpm_norm'
)
freq_plot_default <- plot_circle_frequency(
n = 8,
circle = circle_data,
single = TRUE,
legend = TRUE,
numb_columns = 1,
filter_class = NULL,
point_size = 2
)
# Display the plot
freq_plot_default[[1]]
# View aggregated data
head(freq_plot_default[[2]])
#> bin Factor n proportion
#> 1 1 processed_transcript 1 0.1428571
#> 2 2 processed_transcript 0 0.0000000
#> 3 3 processed_transcript 0 0.0000000
#> 4 4 processed_transcript 1 0.1428571
#> 5 5 processed_transcript 0 0.0000000
#> 6 6 processed_transcript 0 0.0000000
# Visualize each factor level in separate panels
plot_circle_frequency(
n = 8,
circle = circle_data,
single = FALSE,
legend = TRUE,
numb_columns = 3, # Arrange in 3 columns
filter_class = NULL,
point_size = 2
)
#> $plot_stat
#>
#> $data
#> bin Factor n proportion
#> 1 1 processed_transcript 1 0.14285714
#> 2 2 processed_transcript 0 0.00000000
#> 3 3 processed_transcript 0 0.00000000
#> 4 4 processed_transcript 1 0.14285714
#> 5 5 processed_transcript 0 0.00000000
#> 6 6 processed_transcript 0 0.00000000
#> 7 7 processed_transcript 1 0.14285714
#> 8 8 processed_transcript 4 0.57142857
#> 9 1 protein_coding 71 0.07178969
#> 10 2 protein_coding 13 0.01314459
#> 11 3 protein_coding 13 0.01314459
#> 12 4 protein_coding 18 0.01820020
#> 13 5 protein_coding 28 0.02831143
#> 14 6 protein_coding 14 0.01415571
#> 15 7 protein_coding 37 0.03741153
#> 16 8 protein_coding 795 0.80384226
#> 17 1 pseudogene 1 0.25000000
#> 18 2 pseudogene 1 0.25000000
#> 19 3 pseudogene 0 0.00000000
#> 20 4 pseudogene 0 0.00000000
#> 21 5 pseudogene 2 0.50000000
#> 22 6 pseudogene 0 0.00000000
#> 23 7 pseudogene 0 0.00000000
#> 24 8 pseudogene 0 0.00000000
# Focus on specific gene biotypes
plot_circle_frequency(
n = 8,
circle = circle_data,
single = FALSE,
legend = TRUE,
numb_columns = 1, # Single column layout
filter_class = c('protein_coding', 'snoRNA', 'miRNA'),
point_size = 3 # Larger points for emphasis
)
#> $plot_stat
#>
#> $data
#> bin Factor n proportion
#> 9 1 protein_coding 71 0.07178969
#> 10 2 protein_coding 13 0.01314459
#> 11 3 protein_coding 13 0.01314459
#> 12 4 protein_coding 18 0.01820020
#> 13 5 protein_coding 28 0.02831143
#> 14 6 protein_coding 14 0.01415571
#> 15 7 protein_coding 37 0.03741153
#> 16 8 protein_coding 795 0.80384226
# Create a combined plot showing only selected classes
plot_circle_frequency(
n = 8,
circle = circle_data,
single = TRUE,
legend = TRUE,
numb_columns = 1,
filter_class = c('protein_coding', 'miRNA', 'lincRNA'),
point_size = 3
)
#> $plot_stat
#>
#> $data
#> bin Factor n proportion
#> 9 1 protein_coding 71 0.07178969
#> 10 2 protein_coding 13 0.01314459
#> 11 3 protein_coding 13 0.01314459
#> 12 4 protein_coding 18 0.01820020
#> 13 5 protein_coding 28 0.02831143
#> 14 6 protein_coding 14 0.01415571
#> 15 7 protein_coding 37 0.03741153
#> 16 8 protein_coding 795 0.80384226
# Create data.frame version
df <- assay(se, 'tpm_norm') |> as.data.frame()
colnames(df) <- paste('Sample', 1:8, sep = '_')
df$gene_biotype <- rowData(se)$gene_biotype
# Create circle plot data
circle_df <- plot_circle(
x = df,
n = 8,
column_variable_factor = 'gene_biotype',
entropyrange = c(0, Inf),
magnituderange = c(0, Inf),
label = 'legend',
output_table = TRUE
)
plot_circle_frequency(
n = 8,
circle = circle_df,
single = FALSE,
legend = TRUE,
numb_columns = 2,
filter_class = NULL,
point_size = 1.5
)
#> $plot_stat
#>
#> $data
#> bin Factor n proportion
#> 1 1 processed_transcript 1 0.14285714
#> 2 2 processed_transcript 0 0.00000000
#> 3 3 processed_transcript 0 0.00000000
#> 4 4 processed_transcript 1 0.14285714
#> 5 5 processed_transcript 0 0.00000000
#> 6 6 processed_transcript 0 0.00000000
#> 7 7 processed_transcript 1 0.14285714
#> 8 8 processed_transcript 4 0.57142857
#> 9 1 protein_coding 71 0.07178969
#> 10 2 protein_coding 13 0.01314459
#> 11 3 protein_coding 13 0.01314459
#> 12 4 protein_coding 18 0.01820020
#> 13 5 protein_coding 28 0.02831143
#> 14 6 protein_coding 14 0.01415571
#> 15 7 protein_coding 37 0.03741153
#> 16 8 protein_coding 795 0.80384226
#> 17 1 pseudogene 1 0.25000000
#> 18 2 pseudogene 1 0.25000000
#> 19 3 pseudogene 0 0.00000000
#> 20 4 pseudogene 0 0.00000000
#> 21 5 pseudogene 2 0.50000000
#> 22 6 pseudogene 0 0.00000000
#> 23 7 pseudogene 0 0.00000000
#> 24 8 pseudogene 0 0.00000000
Each arc segment represents:
· A variable (e.g., gene), sorted by frequency.
· Arc height indicates proportion of samples above threshold.
· Useful for ranking and filtering in QC pipelines.
The returned table includes:
· bin: the bin index (1 to 8 in these examples)
· Factor: the factor level (here, the gene biotype)
· n: the number of features with that factor level in the bin
· proportion: n divided by the total number of features with that factor level
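In the tables printed above, the proportion column equals n divided by the total n for the same Factor level. The short base R check below reproduces this for two biotypes using counts copied from that output; the relationship is inferred from the printed values rather than from the package source.

```r
# Counts per bin for two biotypes, copied from the printed table above
counts <- data.frame(
  bin    = rep(1:8, times = 2),
  Factor = rep(c("processed_transcript", "pseudogene"), each = 8),
  n      = c(1, 0, 0, 1, 0, 0, 1, 4,
             1, 1, 0, 0, 2, 0, 0, 0)
)

# proportion = n divided by the total n of the same Factor level
counts$proportion <- counts$n / ave(counts$n, counts$Factor, FUN = sum)
counts
```

Within each Factor level the proportions sum to 1, so panels for rare and common biotypes can be compared on the same scale.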
This function creates an abacus plot that classifies observations into dominance groups using entropy-based methods. It visualizes which features (e.g., genes) are dominant in which samples across different expression dominance categories.
This function is ideal for:
· Classifying features into expression dominance categories (e.g., low, medium, high)
· Identifying features that consistently dominate across multiple samples
· Highlighting features of interest in the context of their dominance classification
· Visualizing the distribution of dominance classes across samples in a compact format
se <- airway[1:1000, ]
rowData(se)$gene_length <- rowData(se)$gene_seq_end - rowData(se)$gene_seq_start
se <- tpm_normalization(se, log_trans = TRUE, new_assay_name = 'tpm_norm')
# Prepare data frame
df_abacus <- as.data.frame(assay(se, "tpm_norm"))
df_abacus$gene_id <- rownames(df_abacus)
df_abacus <- df_abacus[, c("gene_id", setdiff(colnames(df_abacus), "gene_id"))]
head(df_abacus[, 1:5])
#> gene_id SRR1039508 SRR1039509 SRR1039512 SRR1039513
#> ENSG00000000003 ENSG00000000003 10.020022 9.574117 10.022841 9.817259
#> ENSG00000000005 ENSG00000000005 0.000000 0.000000 0.000000 0.000000
#> ENSG00000000419 ENSG00000000419 8.417713 8.711587 8.468987 8.593571
#> ENSG00000000457 ENSG00000000457 6.668775 6.522540 6.329381 6.537197
#> ENSG00000000460 ENSG00000000460 2.679289 2.702931 1.929143 2.474782
#> ENSG00000000938 ENSG00000000938 0.000000 0.000000 1.111869 0.000000
# Generate plot with minimal parameters
abacus_res <- plot_abacus(
data = df_abacus,
n = ncol(df_abacus) - 1,
x_variable = "gene_id",
y_variables = colnames(df_abacus)[-1],
percentiles = 4,
title = "Gene Expression Dominance",
point_size = 2,
single = TRUE
)
abacus_res[[1]]
The call to plot_abacus(...) returns a list of two elements (stored above as abacus_res):
abacus_res[[1]]: a ggplot2 object, the abacus-style dominance plot itself.
abacus_res[[2]]: a data.frame with one row per point drawn on the plot, containing:
X_axis
The identifier you passed as x_variable (e.g. gene ID).
Variable
The name of the variable (column) each point belongs to (one of your y_variables).
Qentropy
The computed categorical entropy (Qentropy) for that feature–variable combination.
bin
A factor giving the percentile bin (e.g. “0.25”, “0.50”, etc.) into which that Qentropy falls.
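To make the idea of a percentile bin concrete, here is a minimal base R sketch of quantile binning with four bins. It is only an illustration: the scores are simulated, the names scores, n_bins, and bin are hypothetical, and the package's own binning may differ in its details.

```r
set.seed(1)

# Hypothetical entropy-like scores; these are NOT dominatR output
scores <- runif(20)

# Four bins, mirroring percentiles = 4 in the call above
n_bins <- 4
breaks <- quantile(scores, probs = seq(0, 1, length.out = n_bins + 1))

# Label each bin by its upper percentile: "0.25", "0.50", "0.75", "1.00"
bin <- cut(
  scores,
  breaks = breaks,
  include.lowest = TRUE,
  labels = format(seq(1 / n_bins, 1, length.out = n_bins), nsmall = 2)
)

# Number of scores that fall into each bin
table(bin)
```

Each score ends up in exactly one bin, so a table like this can be used to check how evenly the points are spread across the dominance categories.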
plot_rope() compares two numeric vectors (e.g., expression in Condition A vs. B) using a “rope-like” 1D dominance visualization. Each feature (row) is classified by its relative dominance and can optionally be filtered by entropy or magnitude thresholds.
This function is ideal for:
Comparing two groups of measurements across matched samples or features.
Detecting dominance shifts (e.g., gene up/down regulation between two conditions).
Filtering features by entropy or by maximum value before plotting.
## Data preparation
se <- airway[1:1000, ] # Subset for faster computation
rowData(se)$gene_length <- rowData(se)$gene_seq_end - rowData(se)$gene_seq_start
se <- tpm_normalization(se, log_trans = TRUE, new_assay_name = 'tpm_norm')
df <- as.data.frame(assay(se, 'tpm_norm'))
sample1 <- "SRR1039508"
sample2 <- "SRR1039516"
res_rope = plot_rope(
x = se,
column_name = c(sample1, sample2),
col = c('lightgreen', 'indianred'),
entropyrange = c(0, 0.1),
maxvaluerange = c(4, 8),
title = "SE Input: Low Entropy + Medium Expression"
)
res_rope = plot_rope(
x = se,
column_name = c(sample1, sample2),
col = c('lightgreen', 'indianred'),
entropyrange = c(0.1, 0.8),
maxvaluerange = c(4, 8),
title = "SE Input: Medium Entropy + Medium Expression"
)
res_rope = plot_rope(
x = se,
column_name = c(sample1, sample2),
col = c('lightgreen', 'indianred'),
entropyrange = c(0.8, 1),
maxvaluerange = c(4, 8),
title = "SE Input: High Entropy + Medium Expression"
)
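# No assay is specified in this call; the a and b columns printed below
# match the raw counts of the two samples rather than the tpm_norm assay
res_rope = plot_rope(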
x = se,
column_name = c(sample1, sample2),
output_table = TRUE,
col = c('lightgreen', 'indianred'),
entropyrange = c(0.8, 1),
maxvaluerange = c(4, 8)
)
str(res_rope)
#> 'data.frame': 1000 obs. of 7 variables:
#> $ a : int 679 0 467 260 60 0 3251 1433 519 394 ...
#> $ b : int 1138 0 587 245 78 1 6721 1424 820 658 ...
#> $ comx : num 0.2526 0 0.1139 -0.0297 0.1304 ...
#> $ comy : num 0.188 -0.053 -0.108 0.047 -0.097 ...
#> $ color : chr "whitesmoke" "whitesmoke" "whitesmoke" "whitesmoke" ...
#> $ maxvalue: int 1138 0 587 260 78 1 6721 1433 820 658 ...
#> $ entropy : num 0.953 0 0.991 0.999 0.988 ...
head(res_rope)
#> a b comx comy color maxvalue entropy
#> ENSG00000000003 679 1138 0.25261420 0.1880 whitesmoke 1138 0.9534655
#> ENSG00000000005 0 0 0.00000000 -0.0530 whitesmoke 0 0.0000000
#> ENSG00000000419 467 587 0.11385199 -0.1085 whitesmoke 587 0.9906294
#> ENSG00000000457 260 245 -0.02970297 0.0470 whitesmoke 260 0.9993635
#> ENSG00000000460 60 78 0.13043478 -0.0970 whitesmoke 78 0.9876925
#> ENSG00000000938 0 1 1.00000000 0.1270 whitesmoke 1 0.0000000
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
title = "Default Rope Plot"
)
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
col = c('darkgreen', 'darkred'),
title = "Custom Colors"
)
head(res_rope)
#> a b comx comy color maxvalue
#> ENSG00000000003 10.020022 10.4880532 0.022821808 0.1070 darkred 10.4880532
#> ENSG00000000005 0.000000 0.0000000 0.000000000 0.0025 darkred 0.0000000
#> ENSG00000000419 8.417713 8.4708971 0.003149127 0.1290 darkred 8.4708971
#> ENSG00000000457 6.668775 6.3104879 -0.027604619 -0.1120 darkgreen 6.6687755
#> ENSG00000000460 2.679289 2.7657567 0.015880043 0.1540 darkred 2.7657567
#> ENSG00000000938 0.000000 0.6916003 1.000000000 -0.1340 darkred 0.6916003
#> entropy
#> ENSG00000000003 0.9996243
#> ENSG00000000005 0.0000000
#> ENSG00000000419 0.9999928
#> ENSG00000000457 0.9994503
#> ENSG00000000460 0.9998181
#> ENSG00000000938 0.0000000
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
col = c('darkgreen', 'darkred'),
entropyrange = c(0, 0.1),
title = "Low Entropy Genes (0-0.1)"
)
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
col = c('darkgreen', 'darkred'),
entropyrange = c(0, 0.1),
maxvaluerange = c(2, Inf),
title = "Low Entropy + High Expression"
)
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
col = c('darkgreen', 'darkred'),
entropyrange = c(0, 0.1),
maxvaluerange = c(4, 8),
title = "Low Entropy + Medium Expression"
)
# res_rope is a data.frame here, so [[2]] extracts its second column (b)
head(res_rope[[2]])
#> [1] 10.4880532 0.0000000 8.4708971 6.3104879 2.7657567 0.6916003
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
col = c('darkgreen', 'darkred'),
entropyrange = c(0.1, 0.8),
maxvaluerange = c(4, 8),
title = "Medium Entropy + Medium Expression"
)
res_rope = plot_rope(
x = df,
column_name = c(sample1, sample2),
col = c('darkgreen', 'darkred'),
entropyrange = c(0.8, 1),
maxvaluerange = c(4, 8),
title = "High Entropy + Medium Expression"
)
As the printed outputs above show, the value returned by plot_rope(...) in these examples is a data.frame with one row per point drawn (head(res_rope) prints it directly), containing:
a, b
The original values from each of the two input columns you passed (e.g. the two TPM values).
comx, comy
The computed Cartesian coordinates for each point on the “rope”.
color
The fill color (as a string) actually used for that point. In the filtered example above, points outside the requested ranges are shown as whitesmoke.
entropy
The Shannon entropy score for that feature across the two selected columns.
maxvalue
The larger of a and b for that feature.
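The comx, maxvalue, and entropy columns printed above can be reproduced by hand for the first six genes. The sketch below is not dominatR's implementation. It assumes that comx is the signed share (b - a) / (a + b) and that entropy is the base-2 Shannon entropy of the two values, which agrees with the printed numbers. The helper names pair_entropy and pair_position are hypothetical.

```r
# Raw counts for the first six genes in SRR1039508 (a) and SRR1039516 (b),
# copied from the head(res_rope) output of the SummarizedExperiment example
a <- c(679, 0, 467, 260, 60, 0)
b <- c(1138, 0, 587, 245, 78, 1)

# Base-2 Shannon entropy of the two-part composition (hypothetical helper);
# terms with a zero share contribute 0
pair_entropy <- function(a, b) {
  total <- a + b
  p <- cbind(a, b) / ifelse(total > 0, total, 1)
  terms <- ifelse(p > 0, -p * log2(p), 0)
  unname(rowSums(terms))
}

# Signed position (hypothetical helper): -1 means only a, +1 means only b
pair_position <- function(a, b) {
  total <- a + b
  ifelse(total > 0, (b - a) / total, 0)
}

rope_sketch <- data.frame(
  a = a,
  b = b,
  comx = pair_position(a, b),
  maxvalue = pmax(a, b),
  entropy = pair_entropy(a, b)
)
print(round(rope_sketch, 4))
```

An entropy near 1 means the two samples contribute almost equally, while an entropy near 0 means one sample dominates. This is why the entropyrange windows used above separate dominated features from shared ones.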
plot_triangle() visualizes three-part compositions (e.g., condition A/B/C contributions) on a ternary plot. It is useful when analyzing data with three mutually exclusive categories or proportions that sum to one.
This function is ideal for:
Displaying relationships between three mutually exclusive components.
Exploring feature allocation among three sources or pathways (e.g., tissue A/B/C).
Identifying samples or features located at the edges or center of the triangular composition space.
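The entropyrange windows used in the calls below (0 to 0.4, 0.4 to 1.3, and above 1.3) are easier to read once the scale is known. The sketch below assumes base-2 Shannon entropy over the three shares, which is consistent with the Entropy column printed later in this section. The helper composition_entropy is hypothetical and not part of dominatR.

```r
# Base-2 Shannon entropy of a three-part composition (hypothetical helper);
# zero shares contribute 0, and an all-zero row is given entropy 0
composition_entropy <- function(x) {
  if (sum(x) == 0) return(0)
  p <- x / sum(x)
  sum(ifelse(p > 0, -p * log2(p), 0))
}

# Three illustrative (made-up) compositions
examples <- list(
  one_dominant = c(10, 0, 0),  # a single sample carries everything
  two_shared   = c(5, 5, 0),   # two samples share equally
  all_equal    = c(4, 4, 4)    # all three samples contribute equally
)

entropy_by_case <- sapply(examples, composition_entropy)
print(round(entropy_by_case, 3))
```

Under this assumption the entropy runs from 0 (one sample dominates) through 1 (two samples share equally) to log2(3), about 1.585 (all three equal), so each of the three windows picks out one of these situations.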
## Minimal data preparation
se <- airway[1:1000, ] # Subset for faster computation
rowData(se)$gene_length <- rowData(se)$gene_seq_end - rowData(se)$gene_seq_start
se <- tpm_normalization(se, log_trans = TRUE, new_assay_name = 'tpm_norm')
df <- as.data.frame(assay(se, 'tpm_norm'))
samples <- c("SRR1039508", "SRR1039512", "SRR1039516")
res_triangle <- plot_triangle(
x = df,
column_name = samples
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue')
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(0, 0.4)
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(0.4, 1.3)
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(1.3, Inf)
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(1.2, Inf),
maxvaluerange = c(2, Inf)
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(1.2, Inf),
maxvaluerange = c(5, Inf)
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(1.2, Inf),
maxvaluerange = c(10, Inf)
)
res_triangle <- plot_triangle(
x = df,
column_name = samples,
col = c('indianred', 'lightgreen', 'lightblue'),
entropyrange = c(1.2, Inf),
maxvaluerange = c(2, Inf),
plotAll = FALSE
)
res_triangle <- plot_triangle(
x = se,
column_name = samples,
col = c('darkred', 'darkgreen', 'darkblue'),
entropyrange = c(0, 0.4),
maxvaluerange = c(0.1, Inf),
assay_name = 'tpm_norm'
)
res_triangle <- plot_triangle(
x = se,
column_name = samples,
col = c('darkred', 'darkgreen', 'darkblue'),
entropyrange = c(0.4, 1.3),
maxvaluerange = c(0.1, Inf),
assay_name = 'tpm_norm'
)
res_triangle <- plot_triangle(
x = se,
column_name = samples,
col = c('darkred', 'darkgreen', 'darkblue'),
entropyrange = c(1.3, Inf),
maxvaluerange = c(0.1, Inf),
assay_name = 'tpm_norm'
)
triangle_data <- plot_triangle(
x = se,
column_name = samples,
output_table = TRUE,
entropyrange = c(1.3, Inf),
maxvaluerange = c(0.1, Inf),
assay_name = 'tpm_norm'
)
# View first 6 rows of the output data
head(triangle_data)
#> max_counts comx comy a b
#> ENSG00000000003 10.488053 -1.319600e-02 -0.007711040 0.3281926 0.3282850
#> ENSG00000000005 0.000000 0.000000e+00 0.000000000 0.0000000 0.0000000
#> ENSG00000000419 8.470897 -6.522825e-05 -0.002059715 0.3319602 0.3339822
#> ENSG00000000457 6.668775 8.474045e-04 0.018066565 0.3453777 0.3278004
#> ENSG00000000460 2.765757 -9.825204e-02 0.045000137 0.3633334 0.2616074
#> ENSG00000000938 1.111869 2.018128e-01 -0.500000000 0.0000000 0.6165167
#> c Entropy color
#> ENSG00000000003 0.3435224 1.5846272 darkblue
#> ENSG00000000005 0.0000000 0.0000000 whitesmoke
#> ENSG00000000419 0.3340576 1.5849564 darkblue
#> ENSG00000000457 0.3268219 1.5844933 darkred
#> ENSG00000000460 0.3750591 1.5674206 darkblue
#> ENSG00000000938 0.3834833 0.9604651 whitesmoke
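The returned table contains:
max_counts
The maximum normalized expression value (across your selected samples) for that feature.
comx, comy
The x- and y-coordinates used to place that point inside the triangle.
a, b, c
The share of each of the three selected samples in that feature's total. In the output above, each row sums to 1 or is all zeros.
Entropy
The Shannon entropy of a, b, and c. In the output above, near-equal shares give values close to log2(3), about 1.585.
color
Which of your provided colors was applied (one per sample), or whitesmoke for filtered points.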
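The a, b, c, comx, and comy values printed above can be reproduced for three of the genes from their log-transformed TPM values. The sketch below is not taken from the package source. It assumes a ternary projection with corners at (0, 1), (sqrt(3)/2, -0.5), and (-sqrt(3)/2, -0.5) for a, b, and c, which was inferred from the printed numbers.

```r
# Log-transformed TPM values for SRR1039508 (a), SRR1039512 (b) and
# SRR1039516 (c), copied from the outputs shown earlier in this vignette
vals <- rbind(
  ENSG00000000003 = c(10.020022, 10.022841, 10.4880532),
  ENSG00000000460 = c(2.679289, 1.929143, 2.7657567),
  ENSG00000000938 = c(0.000000, 1.111869, 0.6916003)
)
colnames(vals) <- c("a", "b", "c")

# Convert each row to shares that sum to 1 (all rows here are non-zero)
props <- vals / rowSums(vals)

# Assumed projection: a pulls up, b pulls right, c pulls left
comx <- (props[, "b"] - props[, "c"]) * sqrt(3) / 2
comy <- props[, "a"] - 0.5 * (props[, "b"] + props[, "c"])

tri_sketch <- data.frame(
  max_counts = apply(vals, 1, max),
  comx = comx,
  comy = comy,
  props
)
print(signif(tri_sketch, 6))
```

With this layout a feature shared evenly by all three samples sits at the origin, and a feature with no contribution from the first sample lies on the bottom edge at comy = -0.5, as ENSG00000000938 does in the table above.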
#> R version 4.4.3 (2025-02-28)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Sonoma 14.4.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
#>
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: America/Chicago
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] ggplot2_3.5.2 airway_1.26.0
#> [3] SummarizedExperiment_1.36.0 Biobase_2.66.0
#> [5] GenomicRanges_1.58.0 GenomeInfoDb_1.42.3
#> [7] IRanges_2.40.1 S4Vectors_0.44.0
#> [9] BiocGenerics_0.52.0 MatrixGenerics_1.18.1
#> [11] matrixStats_1.5.0 knitr_1.50
#> [13] BiocStyle_2.34.0 dominatR_0.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] remotes_2.5.0 rlang_1.1.6 magrittr_2.0.3
#> [4] compiler_4.4.3 roxygen2_7.3.2 systemfonts_1.2.3
#> [7] vctrs_0.6.5 stringr_1.5.1 profvis_0.4.0
#> [10] pkgconfig_2.0.3 crayon_1.5.3 fastmap_1.2.0
#> [13] XVector_0.46.0 ellipsis_0.3.2 labeling_0.4.3
#> [16] promises_1.3.3 rmarkdown_2.29 sessioninfo_1.2.3
#> [19] tzdb_0.5.0 UCSC.utils_1.2.0 tinytex_0.57
#> [22] purrr_1.0.4 xfun_0.52 zlibbioc_1.52.0
#> [25] cachem_1.1.0 jsonlite_2.0.0 later_1.4.2
#> [28] DelayedArray_0.32.0 tweenr_2.0.3 R6_2.6.1
#> [31] bslib_0.9.0 stringi_1.8.7 RColorBrewer_1.1-3
#> [34] pkgload_1.4.0 lubridate_1.9.4 jquerylib_0.1.4
#> [37] Rcpp_1.1.0 bookdown_0.43 usethis_3.1.0
#> [40] readr_2.1.5 httpuv_1.6.16 Matrix_1.7-3
#> [43] timechange_0.3.0 tidyselect_1.2.1 rstudioapi_0.17.1
#> [46] abind_1.4-8 yaml_2.3.10 miniUI_0.1.2
#> [49] pkgbuild_1.4.8 lattice_0.22-7 tibble_3.3.0
#> [52] shiny_1.11.1 withr_3.0.2 evaluate_1.0.4
#> [55] desc_1.4.3 urlchecker_1.0.1 polyclip_1.10-7
#> [58] xml2_1.3.8 pillar_1.11.0 BiocManager_1.30.26
#> [61] generics_0.1.4 rprojroot_2.0.4 hms_1.1.3
#> [64] scales_1.4.0 xtable_1.8-4 glue_1.8.0
#> [67] tools_4.4.3 ggnewscale_0.5.2 forcats_1.0.0
#> [70] fs_1.6.6 grid_4.4.3 tidyr_1.3.1
#> [73] tidyverse_2.0.0 devtools_2.4.5 GenomeInfoDbData_1.2.13
#> [76] geomtextpath_0.1.5 ggforce_0.5.0 cli_3.6.5
#> [79] textshaping_1.0.1 S4Arrays_1.6.0 dplyr_1.1.4
#> [82] gtable_0.3.6 sass_0.4.10 digest_0.6.37
#> [85] SparseArray_1.6.2 htmlwidgets_1.6.4 farver_2.1.2
#> [88] memoise_2.0.1 htmltools_0.5.8.1 lifecycle_1.0.4
#> [91] httr_1.4.7 mime_0.13 MASS_7.3-65